Improvement of Tone Intelligibility for Average-Voice-Based Thai Speech Synthesis
نویسنده
چکیده
Problem statement: Tone intelligibility in speech synthesis is an important attribute that should be taken into account. The tone correctness of the synthetic speech is degraded considerably in the average-voice-based HMM-based Thai speech synthesis. The tying mechanism in the decision tree based context clustering without appropriate criterion causes unexpected tone neutralization. Incorporation of the phrase intonation to the context clustering process in the training stage was proposed early. However, the tone correctness is not satisfied. Approach: This study proposes a number of tonal features including tone-geometrical features and phrase intonation features to be exploited in the context clustering process of HMM training stage. Results: In the experiments, subjective evaluations of both average voice and adapted voice in terms of the intelligibility of tone are conducted. Effects on decision trees of the extracted features are also evaluated. By considering gender in training speech, two core experiments were conducted. The first experiment shows that the proposed tonal features can improve the tone intelligibility for female speech model above that of male speech model, while the second experiment shows that the proposed tonal features improve the tone intelligibility for gender dependent model than for gender independent model. Conclusion: All of the experimental results confirm that the tone correctness of the synthesized speech from the averagevoice-based HMM-based Thai speech synthesis is significantly improved when using most of the extracted features.
منابع مشابه
Towards the Development of Speaker-Dependent and Speaker-Independent Hidden Markov Model-Based Thai Speech Synthesis
Problem statement: Tone distortion in Thai languages can deteriorate not only the intelligibility of speech but also its naturalness. Therefore, the correctness of tone must be carefully taken into account in continuous speech synthesis. The preliminary work confronted this problem when applying HMM-based speech synthesis to Thai. Approach: This study presented a study on speaker-dependent and ...
متن کاملTone Question of Tree Based Context Clustering for Hidden Markov Model Based Thai Speech Synthesis
Problem statement: In HMM-based Thai speech synthesis, tone is an important issue that brings about the intelligibility of the synthesized speech. Tone distortion resulted from imbalance of the training data should be appropriately treated. Approach: This study described an HMM-based speech synthesis system for Thai language. In the system, spectrum, pitch and state duration are modeled simulta...
متن کاملStudy on Pitch Contour of Thai Polysyllabic Tone Sequences Using a Generative Model
Thai speech synthesis by rule has been developed. In order to synthesize F0 contours of Thai tones, the generative model of F0 contours (Fujisaki's model) for tonal languages is applied. Along with our method, the pitch contours of Thai polysyllabic words were analyzed. Rules are derived and applied to synthesize Thai polysyllabic tone sequences. We performed listening tests to evaluate intelli...
متن کاملGeneration of fundamental frequency contours for Thai speech synthesis using tone nucleus model
As classic and intrinsic requirements, synthetic speech need to convey correct information with good quality of naturalness to listeners. Fundamental frequency (F0) contours need to be controlled to meet these requirements. Additional challenges have been introduced to tonal languages because the F0 contour reflects both intelligibility and naturalness of the speech. According to the fact that ...
متن کاملImprovement of prosodic characteristic in Vietnamese speech synthesis system base on HMM
The key factors helping people to understand the synthesized voices of text-to-speech system are the naturalness and the intelligibility. However, making more natural voices remains a difficult task because of the speech data’s scarcity. With data limited corpus, prosodic information such as tone, intonation, Part-of-Speech is added to ensure the quality of synthetic speech. In the paper, we in...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2012